Genomic analysis of Coccomyxa viridis, a common low-abundance alga associated with lichen symbioses

Lichen symbiosis is centered around a relationship between a fungus and a photosynthetic microbe, usually a green alga. In addition to their main photosynthetic partner (the photobiont), lichen symbioses can contain additional algae present in low abundance. The biology of these algae and the way they interact with the rest of the lichen symbionts remain largely unknown. Here we present the first genome sequence of a non-photobiont lichen-associated alga. Coccomyxa viridis was unexpectedly found in 12% of publicly available lichen metagenomes. With few exceptions, members of the Coccomyxa viridis clade occur in lichens as non-photobionts, potentially growing in thalli endophytically. The 45.7 Mbp genome of C. viridis was assembled into 18 near chromosome-level contigs, making it one of the most contiguous genomic assemblies for any lichen-associated alga. Comparing the C. viridis genome to its close relatives revealed the presence of traits associated with the lichen lifestyle. The genome of C. viridis provides a new resource for exploring the evolution of the lichen symbiosis, and how symbiotic lifestyles shaped evolution in green algae.

were identified as C. viridis. C. viridis was present in 53 lichen metagenomes (12% of the screened metagenomes), coming from different lichen groups, collectors, and geographic locations (Fig. 3b-g, Supplementary Table S1 online). Nearly all of these algae (88%) are likely not the main photobionts of their respective lichens, as they were detected in lichen symbioses known to have non-Coccomyxa photobionts (Supplementary Table S1 online).
The lichens that contained C. viridis originated from different substrates and included a wide variety of lichen taxonomic groups from the classes Lecanoromycetes, Eurotiomycetes, Dothideomycetes, and Arthoniomycetes (Supplementary Table S1 online). The majority of these lichens possess green algae as their main photobionts, with only two exceptions that associate with cyanobacteria. Most commonly, C. viridis occurred in lichens that have Trebouxia as the main photobiont.

Is C. viridis external or internal to the symbiosis?
We washed eight samples of the X. parietina lichen and screened the resulting samples via PCR. After both gentle and aggressive washing, 75% of thalli still contained detectable C. viridis DNA (Fig. 4, Supplementary Fig. S1 online). In contrast, only one wash water sample had traces of C. viridis. Generic algal primers yielded Trebouxia sequences for all samples, confirming that Trebouxia and not C. viridis is the main photobiont of these lichens (Supplementary Table S2 online).
[...] 26. In ten contigs, we detected the telomeric repeat TTTAGGG, which is typical in green algae 27. Two contigs, cviridis_6 and cviridis_13, had telomeric sequences on both ends and likely represent complete chromosomes. The genome is estimated to be 97.5% complete according to BUSCO and has a duplication rate of 0.2% (Fig. 5c). De novo annotation of the nuclear genome produced 11,248 gene models. Plastid and mitochondrial genomes were each assembled into a single circular contig (Fig. 5d,e). The sizes of the organelle genomes (64 kbp for the mitochondrion and 210 kbp for the plastid) are similar to those of C. subellipsoidea, which has 65 kbp and 175 kbp respectively 28. Both Coccomyxa species have organelle genomes that are uncommonly large for green algae 28. While C. subellipsoidea is also reported to have an unusually high (> 50%) GC content in its organelle genomes 28, our strain of C. viridis had a lower GC content of 42% and 40% respectively.
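The GC content figures quoted above are simple base-composition ratios. As a minimal sketch, assuming the sequence is available as a plain string and that ambiguous bases such as N are excluded from the denominator (an assumption, not a detail given in the text), the calculation could look like this; `gc_content` is a hypothetical helper name.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of a sequence."""
    seq = seq.upper()
    # Count only A, C, G and T so that N and other ambiguity codes are ignored.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy sequence, not taken from the C. viridis assembly.
    print(f"{gc_content('GGCCAATTNN'):.0%}")
```

Applied to a whole organelle contig, the same ratio gives the percentage reported for that genome.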
Our de novo annotation of the C. viridis genome yielded 11,248 gene models and 11,202 protein records. In comparison to other published genomes of Coccomyxa algae, C. viridis has a slightly smaller genome but a larger predicted proteome (the predicted proteomes of C. subellipsoidea and C. pringsheimii included 10,921 and 10,022 proteins respectively 29; Fig. 6a). The types and number of secondary metabolism gene clusters in C. viridis were similar to those in the free-living C. subellipsoidea (Supplementary Table S3 online). In our functional annotations, we focused on the gene families identified by Puginier et al. 29 and Armaleo et al. 30 as connected with lichenization in green algae. Compared to other Coccomyxa species, the C. viridis genome encoded a comparable number of aquaporins, catalases, and domains similar to tryptophan-rich sensory protein/mitochondrial benzodiazepine receptor (TspO/MBR) (Fig. 6b), all groups of genes involved in stress response 29. Unlike other studied Coccomyxa species, the C. viridis genome did not encode any proteins from the Glycoside Hydrolase (GH) 8 family (as confirmed by both InterPro and CAZy annotations), a diverse family of hydrolases that includes licheninases, cellulases, chitosanases, and others. However, it encoded one protein from the GH16 family, which also contains licheninases. Most notably, the C. viridis genome encoded several nitrile hydratases, which are typical in lichen photobionts 29 yet were missing from the lichen photobiont C. pringsheimii (Fig. 6b). The signal transduction component in C. viridis is largely comparable to that of its relatives (Supplementary Fig. S2 online). However, the protein kinase family (IPR000719) appears expanded in C. viridis (Supplementary Table S4 online), reminiscent of similar expansions in the lichen photobiont Asterochloris 30.

Discussion
Here, we present evidence that the green alga Coccomyxa viridis is widespread in lichens as a minor component present in addition to the main photobiont. C. viridis has been reported before from lichens with various non-Coccomyxa photobionts in several isolated reports 13,14,16,20,21,31. Species from the C. viridis clade have been independently cultured from several lichen symbioses 13,14,16. In addition, several amplicon metabarcoding studies of lichen algae reported small numbers of reads assigned to C. viridis 20,21,31. Now, these reports are confirmed by our systematic screening of lichen metagenomic data. We detected C. viridis in about one eighth (12%) of the analyzed metagenomes. The lichen symbioses shown to contain C. viridis are quite diverse, and their symbionts include representatives of both main groups of lichen photobionts (green algae and cyanobacteria) and several classes of mycobionts.
Here, we show the C. viridis clade as a possible third independent origin of the lichen-associated lifestyle, which, however, differs from the other two. In both the C. subellipsoidea and C. simplex/C. solorinae clades, nearly all lichen-associated algae are photobionts. Our metagenomic screening, combined with the literature data, yielded only three occurrences of C. subellipsoidea and C. simplex/C. solorinae in lichens with non-Coccomyxa photobionts (Fig. 3a, Supplementary Table S1 online). In contrast, the majority of algae from the C. viridis clade were isolated from lichens with non-Coccomyxa photobionts, which suggests that they are lichen-associated non-photobiont algae. Several exceptions exist, as the C. viridis clade includes photobionts of several Micarea lichens 34 and the photobiont of Schizoxylon albescens, an unusual lichen whose mycobiont is optionally lichenized and can occur as a nonsymbiotic saprotroph 35, plus a few strains with non-lichen ecologies, including a mussel parasite 36. Overall, the fact that C. viridis can occur in lichens as either a photobiont or a non-photobiont is consistent with prior reports showing 'additional' algae in lichen thalli to be photobionts of unrelated lichens 22. However, our results suggest that C. viridis, unlike other lichen-associated Coccomyxa species, primarily occurs in lichens as a non-photobiont.
How tightly is C. viridis associated with lichen symbioses? From the existing data we cannot determine how frequently C. viridis occurs outside of lichens, and therefore we cannot exclude the possibility that C. viridis is a cosmopolitan alga so common that its presence in lichens is a mere coincidence. At the same time, the majority of existing C. viridis isolates originate from lichen material, with non-lichen ecology being comparatively rare. This, combined with the fact that C. viridis has been found in a wide variety of lichens from different substrates and continents, suggests some degree of association with lichens. Based on the available evidence, we hypothesize that C. viridis can exist both as a free-living and as a lichen-associated alga, but more commonly occurs in a lichen context, as is the case with other lichen algae, including the most common lichen photobiont Trebouxia 22.
The newly sequenced genome of C. viridis is the first genome of a non-photobiont lichen-associated alga and one of the first near chromosome-level assemblies of any lichen-associated alga. It is also the fourth genome from Coccomyxa, in addition to free-living strains of C. subellipsoidea 26 and C. viridis 25 and the lichen photobiont C. pringsheimii (part of the C. simplex/C. solorinae clade) 29. By comparing the three available genomes coming from different clades and different lifestyles, we showed that they share basic genomic characteristics (the fourth genome, belonging to the free-living C. viridis, had not been released at the time of submission). At the same time, our results suggest that C. viridis might exhibit more traits associated with lichenization than the others, as demonstrated by a slight expansion of the kinase family and the presence of nitrile hydratases.
What is the nature of the relationship between non-photobiont algae such as C. viridis and the rest of the lichen symbionts? While it is possible that non-photobiont algae only treat lichens as a substrate to attach to, they can potentially reap other benefits. For lichen photobionts, participation in the symbiosis is hypothesized to bring numerous rewards: protection from herbivory, access to nitrogen, and a better hydration regime (reviewed in 37). The extent to which non-photobiont algae have access to the same benefits might depend on whether they grow epiphytically on the surface of lichen thalli or in the thallus interior. Our screening of washed lichen samples suggests that C. viridis can be endophytic; however, more evidence is needed to prove this conclusively. Conversely, other lichen symbionts might benefit from the non-photobiont algae. While carbohydrates produced by a small number of C. viridis cells are unlikely to significantly alter the carbon budget of the lichen, the presence of a diverse set of algae could facilitate photobiont switching, thereby increasing the plasticity of the symbiosis as a whole 22.
This study began with an accident. Our initial culture of the photobiont of a Xanthoria lichen was overgrown by C. viridis. Perhaps not completely coincidentally, the first sequenced genome of Coccomyxa, C. subellipsoidea, was also produced by accident in a project aimed at a different alga 26. Relatively fast growth, observed for some non-photobiont lichen-associated algae 17, and their frequent presence make C. viridis contamination a likely problem in studies involving culturing of lichen symbionts. At the same time, C. viridis and other frequently discarded and understudied members of lichen microbiota might yet shed light on the evolution of lichen symbiosis.
Currently, we know much less about the biology and evolution of lichen-associated algae than about those of lichen-associated fungi. This is beginning to change with a recent study pioneering comparative genomics of free-living algae and lichen photobionts 29. We believe it can be beneficial to include non-photobiont lichen-associated algae in similar studies in the future, which will now be possible with the high-quality genome of C. viridis we provide. As we accumulate more information on the ecology of individual algal species and on the ways, if any, in which they engage in lichen symbioses, we will be able to chart the evolution of lichenization in green algae.

Culturing
The alga was cultured from a thallus of the lichen Xanthoria parietina kindly provided by Prof. Paul Dyer, University of Nottingham, UK. The thallus was collected in the Peak District, UK. The photobiont was isolated from the thallus as previously described 38,39. The culture was routinely grown in liquid Bold's Mineral Medium (BMM) on a 12-h night/day light cycle.

Genome sequencing and assembly
DNA was extracted from 34 mg (dry weight) of algal culture, which was snap-frozen, homogenized with a Geno/Grinder homogenizer (SPEX SamplePrep, Metuchen NJ, USA) at 1300 rpm for 1 min, and extracted with the NucleoBond High Molecular Weight DNA Kit (Macherey-Nagel, Düren, Germany). The extraction yielded 16.5 μg of high-molecular-weight DNA, which was used for long-read sequencing. Short fragments were removed using the Circulomics Short Read Eliminator Kit (Pacific Biosciences, Menlo Park CA, USA) with a 25 kbp cut-off. A sequencing library was prepared using the Native Barcoding Kit 96 V14 (Oxford Nanopore Technologies, Oxford, UK). The library was sequenced on a PromethION Flow Cell FLO-PRO114M (Oxford Nanopore Technologies, Oxford, UK) to 25 Gbp of data.
In addition, we used the same DNA extraction to produce short-read data. DNA was sent to Novogene UK (Cambridge, UK) and sequenced on an Illumina NovaSeq 6000 platform to 2 Gbp of PE150 data. The resulting short-read data were used to polish the long-read assembly with Pilon v1.23 41.

Transcriptomic sequencing
We generated transcriptomic data to be used for training during the annotation of the genome. Algal culture was transferred from the liquid stock and plated on Petri dishes with 99:1 BMM:MEYE culture medium. The cultures were harvested 2, 9, 21, and 42 days post inoculation, with three replicates for each time point. We snap-froze the harvested material in liquid nitrogen and extracted RNA using the RNeasy Plant Mini Kit (QIAGEN, Hilden, Germany). The RNA was sent to Novogene UK (Cambridge, UK) and sequenced on an Illumina HiSeq 2500 platform to PE150 data.

Genome annotation
Since our initial BLASTx search against NCBI-nr showed that our genomic assembly contained bacterial sequences, we used a metagenomic binning approach to filter out contamination. We aligned Illumina reads against the assembly using Bowtie2 42 and used the resulting BAM file to bin the assembly with MetaBAT2 43. Next, we used the BLASTx search to select the bin that corresponded to the target algal genome. We confirmed the genome quality with BUSCO5 44, using the chlorophyta_odb10 database. To detect telomeric repeats, we used the script from Hiltunen et al. 45 with 'CCCTAAA' as a query. To detect contigs representing organelle genomes, we used the results from the same BLASTx search.
Organelle genomes were annotated separately. We extracted contigs identified as organelle genomes from our initial assembly and predicted genes using MFannot 62 and GeSeq 63. To aid the annotation, we aligned RNA-seq data against the contigs identified as mitochondrial and plastid genomes using STAR v2.5.4b 64. To finalize the annotations, we manually combined the outputs of the two tools and cross-referenced them against the RNA-seq alignment. The annotations were visualized using the OGDraw webserver 65.
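The telomere search described above looks for tandem copies of the repeat at contig ends. The sketch below illustrates the idea and is not the script from Hiltunen et al.: the function name, the 200-bp window and the minimum of three tandem copies are all assumptions made for illustration. It checks for CCCTAAA at the start of a contig and for its reverse complement, TTTAGGG, at the end.

```python
import re


def telomere_ends(seq, window=200, min_repeats=3):
    """Report whether a contig starts and/or ends with telomeric repeats.

    `window` and `min_repeats` are illustrative defaults, not values
    taken from the original analysis.
    """
    seq = seq.upper()
    # Start of the contig: tandem copies of CCCTAAA within the first `window` bases.
    start_pattern = r"(?:CCCTAAA){%d,}" % min_repeats
    # End of the contig: tandem copies of the reverse complement, TTTAGGG.
    end_pattern = r"(?:TTTAGGG){%d,}" % min_repeats
    start_hit = re.search(start_pattern, seq[:window]) is not None
    end_hit = re.search(end_pattern, seq[-window:]) is not None
    return start_hit, end_hit


if __name__ == "__main__":
    # Toy contig with repeats on both ends, as expected for a complete chromosome.
    contig = "CCCTAAA" * 5 + "ACGT" * 100 + "TTTAGGG" * 5
    print(telomere_ends(contig))
```

A contig for which both values are true is a candidate complete chromosome, which is the criterion applied to cviridis_6 and cviridis_13.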

Phylogenetic analyses
To provide a taxonomic identification for the sequenced genome, we first built a phylogenomic tree using 10 reference genomes and transcriptomes from Trebouxiophyceae with Chlamydomonas eustigma as an outgroup (Supplementary Table S5 online). We identified chlorophyta_odb10 BUSCO single-copy orthologs shared by all genomes and transcriptomes, which amounted to 196 loci. Next, we created a single concatenated alignment using MAFFT v7.271 66 and trimmed it with trimAl v1.2 67 to remove positions present in < 70% of organisms. Finally, we computed a phylogeny with RAxML v8.2.12 68, using the PROTGAMMAAUTO model. To provide better taxonomic resolution, we created a tree based on the ITS region (ITS1, 5.8S ribosomal RNA gene, ITS2). We included 77 reference ITS sequences from Coccomyxa and Elliptochloris (Supplementary Table S6 online). The tree was constructed as described above.
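The trimming step removes alignment columns that are occupied in fewer than 70% of the organisms. The following sketch shows that criterion on a toy alignment; it is an illustration of the rule, not a reimplementation of trimAl, and the function name, the dictionary input format and the set of gap characters are assumptions.

```python
def trim_alignment(aligned, min_occupancy=0.7, gap_chars="-?"):
    """Drop columns in which fewer than `min_occupancy` of sequences have a residue.

    `aligned` maps sequence names to aligned strings of equal length.
    """
    names = list(aligned)
    rows = [aligned[name] for name in names]
    if not rows:
        return {}
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column when the fraction of non-gap characters reaches the threshold.
    keep = [
        i
        for i in range(length)
        if sum(row[i] not in gap_chars for row in rows) / len(rows) >= min_occupancy
    ]
    return {name: "".join(aligned[name][i] for i in keep) for name in names}


if __name__ == "__main__":
    # Toy protein alignment; the third column is present in only 1 of 4 taxa.
    toy = {"a": "MK-L", "b": "MK-L", "c": "M--L", "d": "MKAL"}
    print(trim_alignment(toy))
```

Columns at exactly 70% occupancy are kept here, matching the wording "present in < 70% of organisms" for the columns that are removed.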

Screening of publicly available metagenomic data
We searched for the Coccomyxa viridis ITS in the 438 metagenomic assemblies from Tagirdzhanova et al. 69. The metagenomes were sourced from NCBI and originated from 12 different studies and 377 lichen symbioses (Supplementary Table S7 online). The procedure for metagenomic assembly is described in Tagirdzhanova et al. 69. Briefly, each metagenomic dataset was filtered to remove human contamination, clipped to remove adapters, and assembled separately with metaSPAdes. To screen the metagenomic assemblies, we used a BLASTn search with an e-value cut-off of 1e-65. As a query, we used the ITS region (ITS1, 5.8S ribosomal RNA gene, ITS2) pulled from the genome assembly. Extracted hits were combined with the ITS reference sequences (Supplementary Table S2 online), aligned as described above, and used to construct a phylogeny using IQ-TREE v2.2.2.2 70 with 10,000 rapid bootstraps and the TIM2 + F + I + R3 substitution model.
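The screening step amounts to filtering BLASTn hits by e-value and counting how many assemblies retain a hit. The sketch below assumes the standard 12-column tabular output (`-outfmt 6`), in which the subject ID is the second column and the e-value the eleventh. The `<assembly>|<contig>` naming of subject IDs and both function names are hypothetical, and treating the cut-off as inclusive is also an assumption. It only identifies candidate hits; assigning hits to C. viridis required the phylogenetic placement described above.

```python
import csv
import io


def assemblies_with_hits(blast_tsv, max_evalue=1e-65):
    """Return the assembly IDs that have at least one hit passing the e-value cut-off."""
    hits = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, evalue = row[1], float(row[10])
        if evalue <= max_evalue:
            # Hypothetical naming scheme: '<assembly>|<contig>'.
            hits.add(subject_id.split("|", 1)[0])
    return hits


def prevalence(n_positive, n_screened):
    """Fraction of screened assemblies in which the target was detected."""
    return n_positive / n_screened


if __name__ == "__main__":
    toy = (
        "its\tasm1|c5\t99.1\t600\t5\t0\t1\t600\t10\t609\t1e-120\t900\n"
        "its\tasm2|c9\t85.0\t120\t18\t0\t1\t120\t3\t122\t2e-20\t150\n"
    )
    print(assemblies_with_hits(toy))
    # 53 positive metagenomes out of 438 screened, as reported in the text.
    print(f"{prevalence(53, 438):.1%}")
```

With the counts reported in this study, 53 of 438 corresponds to the 12% quoted in the Results.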

Figure 1. Algae in Xanthoria parietina. (a) Micrographs of Coccomyxa viridis cultured from a X. parietina thallus. C. viridis cells are ellipsoid and have one or several chloroplasts located near the cell exterior. For comparison, the lower track shows the main photobiont of X. parietina, Trebouxia. Trebouxia cells are globular and contain one centrally located chloroplast, often star-shaped or lobed. In both cultures, cell walls were stained with the Calcofluor White (CFW) stain. Scale bar = 5 μm. (b) Thallus of X. parietina growing on a tree branch; photo courtesy of Phil Robinson. (c) Cross-section through a X. parietina thallus, showing internal structure with four layers: uc = upper cortex (formed primarily by mycobiont hyphae embedded in an extracellular matrix), al = algal layer (mycobiont hyphae and photobiont cells), me = medulla (loosely arranged mycobiont hyphae), lc = lower cortex (mycobiont hyphae embedded in an extracellular matrix). The arrow points to a cell of Trebouxia residing in the algal layer. Scale bar = 50 μm.

Figure 2.

Figure 3. C. viridis presence in publicly available lichen metagenomic data. (a) Phylogenetic tree of Coccomyxa ITS sequences. Pink dots represent sequences pulled from lichen metagenomic data. Blue represents the C. viridis clade and C. viridis sequences from the literature. (b) Map showing the geographic location of each lichen sample in which we detected C. viridis by screening the metagenomic data produced from that sample. All these samples were collected in North America and Europe; however, the real distribution of C. viridis could be broader, given that existing metagenomic data on lichens are geographically biased towards these two continents. (c) Presence of C. viridis across lichen taxonomic groups. The tree represents the phylogeny of lichen mycobionts modified from Tagirdzhanova et al. 69 and Díaz-Escandón et al. 73; only taxa included in the metagenomic screening are shown. Green dots show taxonomic groups for which C. viridis was detected, with the prevalence ratios shown to the right. (d-g) Examples of lichen symbioses containing C. viridis; photos courtesy of Jason Hollinger. (d) Chrysothrix xanthina (Arthoniales, Arthoniomycetes). (e) Dibaeis baeomyces (Pertusariales, Lecanoromycetes). (f) Cladonia ochrochlora (Lecanorales, Lecanoromycetes). (g) Solorina crocea (Peltigerales, Lecanoromycetes).

Figure 4. PCR-based screening for the presence of C. viridis in washed lichen samples. Eight X. parietina thalli were used, of which half were washed more gently in water, and each produced two DNA extractions: one from wash water and one from the washed thallus. The other half were washed more aggressively in ethanol and bleach; for those samples we only extracted DNA from the washed thallus. The top panel shows screening results for C. viridis-specific primers. Dark-green circles represent DNA extractions containing C. viridis DNA. A phylogenetic tree confirming the taxonomic assignment of C. viridis sequences is shown in Supplementary Fig. S1 online. The bottom panel shows screening results for generic algal rbcL primers. Light-green circles represent DNA extractions that yielded sequences of Trebouxia; white circles represent DNA extractions that did not yield a usable sequence.

Figure 5. Legend entries: photosystem assembly/stability factors, RubisCO large subunit, ATP synthase, cytochrome b/f complex, photosystem II, photosystem I.

Figure 6. Comparative genomics analysis of C. viridis (highlighted in red) and other Trebouxiophyceae genomes. Information for the genomes other than C. viridis is taken from Puginier et al. 29. (a) Basic genome statistics plotted by taxonomic group; the circles next to the species name represent the ecology of each strain. (b) Presence of InterProScan gene functional families across Trebouxiophyceae genomes. Here we show only functional families highlighted by Puginier et al. 29 as potentially relevant to lichenization in green algae. The size of the bubbles represents the number of genes assigned to each family in a given genome.

C. viridis comes from a genus that includes many symbiotic algae and, among others, lichen photobionts 32. Most Coccomyxa photobionts come from one of the two clades, C. subellipsoidea and C. simplex/C. solorinae, which led to the hypothesis that lichenization happened in Coccomyxa twice.

Scientific Reports | (2023) 13:21285 | https://doi.org/10.1038/s41598-023-48637-w